Automatic parallelization of fine-grained metafunctions on a chip multiprocessor
Similar resources
Enabling Parallelization via a Reconfigurable Chip Multiprocessor
While reconfigurable computing has traditionally involved attaching a reconfigurable fabric to a single processor core, the prospect of large-scale CMPs calls for a reevaluation of reconfigurable computing from the perspective of multicore architectures. We present ReMAPP, a reconfigurable architecture geared towards application acceleration and parallelization. In ReMAPP, parallel threads shar...
Fine-Grained Parallelization of a Vlasov-Poisson Application on GPU
Understanding turbulent transport in magnetised plasmas is a subject of major importance to optimise experiments in tokamak fusion reactors. Also, simulations of fusion plasma consume a great amount of CPU time on today’s supercomputers. The Vlasov equation provides a useful framework to model such plasma. In this paper, we focus on the parallelization of a 2D semi-Lagrangian Vlasov solver on G...
Case study of gate-level logic simulation on an extremely fine-grained chip multiprocessor
Explicit-multi-threading (XMT) is a parallel programming approach for exploiting on-chip parallelism. Its fine-grained single program multiple data (SPMD) programming model is suitable for many computing intensive applications. In this paper, we present a parallel gate level logic simulator implemented on an XMT platform and study its performance. Test results show potential for achieving more ...
Fine-grained parallelization of lattice QCD kernel routine on GPUs
Simulation time for the classical problem of Lattice Quantum Chromodynamics (Lattice QCD) is dominated by one kernel routine responsible for computing the actions of a Dirac operator. This paper describes an experience in parallelizing this kernel routine. We explore parallelization granularities for this kernel routine on Graphical Processing Units (GPUs). We show that fine-grained parallelism...
Automatic Compilation to a Coarse-grained Reconfigurable System-on-Chip
The rapid growth of device densities on silicon has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, one of the obstacles to the wider acceptance of this technology is its programmability. The application needs to be programmed in hardware description languages or an assembly equivalent, whereas most application programmers are used to the alg...
Journal
Journal title: ACM Transactions on Architecture and Code Optimization
Year: 2013
ISSN: 1544-3566, 1544-3973
DOI: 10.1145/2541228.2541237